STEM Degrees: 3  STEM Participants: 3

Major Goals: The inability of users to communicate secretly on online social networking (OSN) platforms is a key
obstacle to overcome if these platforms are to be used in the tactical world. While exclusive military networks such
as MilBook and Service-Connected exist, they do not support secret group communications. Furthermore, access
to such social networks via mobile platforms raises concerns such as leakage of private data. Finally, any
secret communications can be blocked by censorship firewalls that maintain state and look for specific keywords or
features. In this project, we try to address all of these issues.

Accomplishments: We make the following contributions, which we describe in some detail in the final report
(further fine-grained details can be found in the papers published on them): (A) we design a system
that facilitates in-band embedding of secrets (limited in size) in shared content on OSNs; (B) we design Hermes, a
cost-effective decentralized OSN architecture that allows exchange of secret information among a group without
revealing any details about either group membership or posting patterns; (C) we design ZapDroid, which
quarantines OSN or other applications on smartphones to reduce their attack surface and thereby prevent them
from leaking any information that needs to be secret; and (D) we perform an in-depth measurement study that
characterizes what firewalls such as the Great Firewall of China might do in order to prevent confidential
communications, and how to evade such preventive censorship.

To date, we have four conference papers and one journal paper either published or accepted for publication. They
are mostly in top-tier conference venues, viz., IEEE CNS 2014, ACM UbiComp 2015, SecureComm 2015, and
ACM IMC 2017, and in a top journal, viz., the IEEE Transactions on Mobile Computing.

Training Opportunities: The work also supported multiple graduate students. Among the graduated PhD students, Jianxia
Ning is now at Cisco, Indrajeet Singh joined Akamai, Masoud Akhoondi.
these except one are available on the PI's website. The latest paper, which will appear in IMC 2017, will be made
available after changes are made to produce the camera-ready version.
The inability of users to communicate secretly on online social networking (OSN)
platforms is a key obstacle to overcome if these platforms are to be used in the tactical
world. While exclusive military networks such as MilBook [1] and Service-Connected
[2] exist, they do not support secret group communications. Furthermore, access to such
social networks via mobile platforms raises concerns such as leakage of private
data. Finally, any secret communications can be blocked by censorship firewalls that
maintain state and look for specific keywords or features. In this project, we try to
address all of these issues. We make the following contributions, which we describe in
some detail in the final report (further fine-grained details can be found in the papers
published on them): (A) we design a system that facilitates in-band embedding of
secrets (limited in size) in shared content on OSNs; (B) we design Hermes, a
cost-effective decentralized OSN architecture that allows exchange of secret information
among a group without revealing any details about either group membership or
posting patterns; (C) we design ZapDroid, which quarantines OSN or other applications on
smartphones to reduce their attack surface and thereby prevent them from leaking any
information that needs to be secret; and (D) we perform an in-depth measurement study
that characterizes what firewalls such as the Great Firewall of China might do in order to
prevent confidential communications, and how to evade such preventive censorship.

To date, we have four conference papers and one journal paper either published or
accepted for publication. They are mostly in top-tier conference venues, viz., IEEE CNS
2014 [3], ACM UbiComp 2015 [4], SecureComm 2015 [5], and ACM IMC 2017 [6],
and in a top journal, viz., the IEEE Transactions on Mobile Computing [7].

1. Secret Message Sharing Using Online Social Media

In this work, we undertake a study to obtain a fundamental understanding of the
challenges in creating a viable covert channel for confidential communications on OSNs
or other photo-sharing sites. These challenges include the following. First, photo-sharing
sites often process uploaded images. While some of the processing functions are clearly
specified on the photo-sharing sites (e.g., any photo exceeding a pre-specified size limit
will be re-sized), not all such functions are publicly known. These (possibly unknown)
processing functions often interfere with the use of steganography, which we use to
create the covert channel. Second, it is well known that steganography does not offer
perfect secrecy. Censors can try to read the embedded message by applying a variety of
extraction algorithms on a carrier image. Thus, to prevent exposure in the rare cases of
interception, one will have to encrypt the secret information embedded in the shared
photographs. Encryption requires the establishment of secret keys between
communicating entities, for which prior work often assumes the existence of an
out-of-band channel. However, the creation of such an out-of-band channel is difficult because
phone calls, e-mail exchanges, and Internet communication may be monitored.


Our  next  goal  is  to  address  the  above  challenges  and  build  a  framework  for  confidential 
communication  on  public  photo-sharing  sites.  Towards  this,  we  make  three  key 
contributions.  First,  to  understand  how  secretly  embedded  messages  are  affected  by 
processing  done  on  photo-sharing  sites,  we  perform  an  in-depth  measurement  study.  We 
analyze  photos  uploaded  on  four  popular  sharing  sites — Google+,  Facebook,  Twitter,  and 
Flickr.  We  consider  both  photos  wherein  secret  information  is  embedded  and  photos 
without  any  such  embedding.  We  observe  that,  while  the  integrity  of  hidden  messages  is 
preserved  on  some  sites  (e.g.,  Google+),  other  sites  (e.g.,  Facebook  and  Flickr)  perform 
various processing functions on uploaded images and hence the extraction of secret
messages  from  downloaded  images  fails.  Our  study  sheds  light  on  the  processing 
performed  on  different  sites  and  provides  an  understanding  of  why  secret  content  is 
affected. 


Second, based on the understanding obtained above, we propose simple changes to the
steganographic encoding process, which ensure that, unlike prior approaches, the
embedded secret messages survive the image processing performed by photo-sharing
sites. Specifically, unlike prior approaches that modify the least significant bit (LSB) of
the DCT coefficients of an image, we propose to modify the second least significant bit
(2-LSB); this ensures that the secret message is retained in spite of processing done on
the OSN or photo-sharing site. The robustness offered allows the use of less intense
forward error correction (FEC) codes, thereby increasing the secret-message carrying
capacity of an image.
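
To make the 2-LSB idea concrete, the sketch below writes payload bits into bit 1 of integer coefficients and reads them back. It is a toy illustration, not the encoder from [3]: the function names, the plain list of integers standing in for quantized DCT coefficients, and the bit-0-flipping perturbation used as a stand-in for site processing are all assumptions made for the example.

```python
def embed_bits_2lsb(coeffs, bits):
    """Store each payload bit in bit 1 (second least significant) of a coefficient."""
    if len(bits) > len(coeffs):
        raise ValueError("payload longer than carrier")
    stego = list(coeffs)
    for i, bit in enumerate(bits):
        sign = -1 if stego[i] < 0 else 1
        magnitude = abs(stego[i])
        # Clear bit 1, then set it to the payload bit; bit 0 is left untouched.
        magnitude = (magnitude & ~0b10) | (bit << 1)
        stego[i] = sign * magnitude
    return stego


def extract_bits_2lsb(coeffs, n_bits):
    """Read back bit 1 of the first n_bits coefficients."""
    return [(abs(c) >> 1) & 1 for c in coeffs[:n_bits]]


def flip_lsb(coeffs):
    """Toy perturbation (hypothetical): toggle bit 0 of every magnitude."""
    return [(-1 if c < 0 else 1) * (abs(c) ^ 1) for c in coeffs]


# Hypothetical quantized coefficients and an 8-bit payload.
carrier = [12, -7, 3, 0, -20, 5, 9, -1]
payload = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_bits_2lsb(carrier, payload)
recovered = extract_bits_2lsb(flip_lsb(stego), len(payload))
```

In this toy setting the payload survives a perturbation that would invert every bit of a plain LSB payload. Real site processing is more complex than a bit-0 flip, which is why the measurements above, rather than this sketch, are the evidence for the approach.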


Though  simple,  our  approach  is  not  apparent  without  the  detailed  study  on  the  different 
photo-sharing  sites.  Importantly,  this  improved  reliability  does  not  come  at  the  expense  of 
greater  likelihood  of  detection  of  hidden  messages.  We  evaluate  our  approach  by 
applying  two  state-of-the-art  steganalysis  tools  and  observe  that,  for  a  fixed  amount  of 
secret  data,  the  likelihood  of  detecting  secret  information  embedded  with  our  approach  is 
similar  (or  even  lower  in  some  cases)  to  the  probability  of  detection  when  prior 
approaches  for  steganographic  embedding  are  applied  (while  surviving  the  processing 
done on the site). In the table below we show the reduction in the FEC overhead and the
higher resistance to steganalysis with our 2-LSB approach.


Table  1:  The  2-LSB  approach  offers  lower  detection  likelihood  and  FEC  overhead  compared  to 
traditional  LSB  schemes. 


Method                      BER       FEC        Detection likelihood    Detection likelihood
                                      overhead   (ensemble classifier)   (StegAlyzerAS)
LSB                         0.1?239   0.0        0.44                    0.69
LSB + 2-LSB                 0.08144   0.0        0.47                    0.68
2-LSB                       0.00968   0.0        0.50                    0.63
LSB + FEC (15,13)           0.09375   0.1333     0.45                    --
LSB + FEC (continued)       --        0.2667     0.48                    --
LSB + FEC (continued)       --        0.5714     0.53                    --
LSB + 2-LSB + FEC (15,13)   0.02993   0.1333     0.50                    0.68
LSB + 2-LSB + FEC (15,11)   0.0       0.2667     0.51                    --
2-LSB + FEC (15,13)         0.0       0.1333     0.51                    --

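
The FEC overhead values in Table 1 are consistent with reading the FEC labels as (n, k) block-code parameters and taking the overhead to be the parity fraction (n - k)/n. That reading is an inference from the numbers (for example, 0.2667 for a (15,11) code), not something the report states, and the helper name below is hypothetical.

```python
def fec_overhead(n, k):
    """Fraction of each n-symbol codeword spent on parity for an (n, k) block code."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return (n - k) / n


# Assumed code parameters, read from the method labels in Table 1.
overheads = {(n, k): round(fec_overhead(n, k), 4) for n, k in [(15, 13), (15, 11)]}
# overheads == {(15, 13): 0.1333, (15, 11): 0.2667}
```

Under this reading, a lower-overhead code leaves a larger share k/n of the embedded bits for the secret message, which is the capacity gain the text attributes to the more robust 2-LSB embedding.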

Finally, as discussed above, encrypting the secretly embedded messages is a must.
Therefore, to enable recipients of the shared photo to extract the raw data, a key exchange
between the sender and recipients is essential. Towards this, we propose a protocol for
bootstrapping the private communication without any out-of-band channel (unlike what is
assumed in prior work). Our bootstrapping phase uses the very same channel, i.e.,
uploaded images, to exchange keys. The work was published in IEEE CNS 2014 [3] and
was awarded the “best paper runner up award.”


2.  Reducing  the  attack  surface  of  mobile  applications  to  prevent  leakage  of 
confidential  information  with  ZapDroid 

The Google Play Store has more than 1.3 million apps, and the number of app downloads
is roughly 1 billion per month. However, after users interact with many such apps for an
initial period following the download, they almost never do so again. Statistics indicate
that for a typical app, less than half of the people who downloaded it use it more than
once. Reports also suggest that more than 86% of users do not even revisit an app a day
after the initial download. Longer-term uninstall rates of about 15 to 18%, however,
are considered high. This means that users often leave installed apps on their
phones. Many of these applications are social network applications: users install portals
to OSNs, often never to use them again.

More  generally,  users  may  only  interact  with  some  downloaded  apps  or  OSN  portals 
infrequently  (i.e.,  not  use  them  for  prolonged  periods).  These  apps  continue  to  operate  in 
the  background  and  have  significant  negative  effects  (e.g.,  leak  private  information  or 
significantly tax resources such as the battery). Unfortunately, users are often unaware of
such app activities. We call such seldom-used apps, which indulge in undesired activities,
“zombie  apps.” 

In this work, we seek to build a framework, ZapDroid, to identify and subsequently
quarantine such zombie apps to stop their undesired activities. Since a user can change
her mind about whether or not to use an app, a zombie app must be restored quickly if the
user chooses. The classification of an app as a zombie app is inherently subjective. An
app unused for a prolonged period should be classified as a zombie app if its resource
usage during the period is considered significant and/or if its access of private data is
deemed serious. Thus, instead of automatically classifying zombie apps, we seek to
empower the user by exporting the information that she would need to make this
decision. Moreover, the way in which a zombie app should be quarantined depends on
whether the user is likely to want to use the app again in the future (e.g., an OSN app that
the user tried once and decided is not interesting vs. a Skype app that the user uses
infrequently). The apps that a user is likely to use again fairly soon must not be fully
uninstalled; real-time restoration (when needed) may be difficult if there is no good
network connectivity. We seek to enable users to deal with these different scenarios
appropriately.


Challenges: We address many challenges en route to designing and building ZapDroid.
First, to motivate the need for ZapDroid, we ask the question: “How often do users
download  apps  and  leave  them  on  their  phones,  and  how  do  these  apps  adversely  affect 
the  user  in  terms  of  consuming  phone  resources  and  privacy  leakage?”  We  address  this 
challenge  via  an  extensive  user  study.  Next,  we  ask  “How  can  we  detect  background  apps 
that  either  consume  high  resources  or  violate  privacy  in  a  lightweight  manner?”  Such 
apps  are  the  candidates  for  being  zombie  apps.  Continuous  app  monitoring  can  be  too 
resource-intensive  to  be  practical.  Further,  application-level  implementations  are 
infeasible  since  Android  does  not  allow  any  app  to  track  the  permission  access  patterns  of 
other  apps.  The  third  challenge  is  to  effectively  quarantine  apps,  i.e.,  “How  can  we  design 
effective  methods  to  ensure  that  zombie  apps  are  quarantined  and  remain  in  that  state 
unless  a  user  wants  them  restored?”  With  current  approaches,  apps’  background  activities 
are  constrained  only  temporarily,  until  they  are  woken  up  due  to  time-outs  or  external 
stimuli. Finally, “How can we restore previously quarantined apps in a timely way, even
under  conditions  of  poor  network  connectivity  (if  the  user  desires)?”  The  restored  app 
must  be  in  the  same  state  that  it  was  in  prior  to  the  quarantine.  Reinstalls  from  the  Google 
Play  Store  can  be  hard  if  network  connectivity  is  poor  and  hence,  should  not  be  invoked 
when  it  is  highly  likely  that  the  user  will  restore  the  app.  Further,  clean  uninstalls  can 
result  in  loss  of  application  state. 

Contributions: Our framework, ZapDroid, addresses the above challenges and allows
users to effectively manage infrequently used apps. In designing and building ZapDroid,
we make the following contributions.

Showcase the unwanted behaviors of candidate zombie apps: We conduct a month-long
study where we enlist 80 users on Amazon’s Mechanical Turk to download an
app (TimeUrApps) we develop. TimeUrApps identifies (other) apps on the users’
phones that have not been used for the month. Once we identify these apps, we
undertake an in-house, comprehensive experimental study to understand their
behaviors when they are not being actively used. We find that a zombie app on a
typical user’s phone (the median user in our targeted experiments) could consume as
much as 58 MB of bandwidth and more than 20% of the total battery capacity in a
day. Further, many such apps access information such as the user’s location and
transmit it over the network. The following figure shows some of the permissions
that some of the apps in our study obtain that can leak private information (e.g.,
recorded audio). Further, there is no need for these permissions to be granted to these
apps.
Identify candidate zombie apps that are most detrimental to the user’s device: We
design mechanisms that are integrated within the Android OS (we make changes to
the underlying Android Framework’s activity management, message passing, and
resource management components) to track (i) a user’s interactions with the apps on
her device to identify unused apps, and (ii) the resources consumed and the private
information accessed by these apps to determine candidate zombie apps, from which
the user can choose to quarantine those she considers to be zombie apps.
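
The selection step described above can be sketched as a simple filter and ranking. This is an illustrative model only: ZapDroid gathers this data inside the modified Android framework, whereas the record fields, the 30-day idle threshold, and the ranking order below are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AppRecord:
    name: str
    days_since_last_use: int       # derived from tracked user interactions
    background_mb_per_day: float   # network traffic generated while unused
    battery_pct_per_day: float     # share of battery capacity used while unused
    private_accesses: tuple = ()   # e.g. ("location", "record_audio")


def candidate_zombies(apps, min_idle_days=30):
    """Return unused apps, heaviest background consumers first.

    The list is only a suggestion; the user makes the final call on
    which candidates to quarantine.
    """
    idle = [a for a in apps if a.days_since_last_use >= min_idle_days]
    return sorted(
        idle,
        key=lambda a: (a.battery_pct_per_day, a.background_mb_per_day,
                       len(a.private_accesses)),
        reverse=True,
    )


# Hypothetical usage records.
apps = [
    AppRecord("osn_portal", 45, 58.0, 20.0, ("location",)),
    AppRecord("maps", 2, 10.0, 5.0),
    AppRecord("old_game", 60, 1.0, 0.5),
]
ranked = [a.name for a in candidate_zombies(apps)]
```

Keeping the classification itself with the user mirrors the point made earlier that whether an app is a zombie is inherently subjective.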

Dynamically  revoke  permissions  from  zombie  apps,  or  offload  them  to  external 
storage:  The  quarantine  module  of  ZapDroid  is  invoked  based  on  user  input.  She  has 
to  categorize  a  zombie  app  as  either  “likely  to  restore”  or  “unlikely  to  restore”;  the 
two  categories  are  quarantined  differently.  For  the  first  category,  only  permissions 
enjoyed  by  the  zombie  app  are  revoked  but  all  relevant  data/binaries  are  stored  on 
the  device  itself.  For  the  second  category,  the  associated  data/binaries  are  removed 
from  the  device  and  user-specific  app  state  is  moved  to  either  the  cloud  or  to  a 
different  device  (a  desktop)  owned  by  the  user;  the  transfers  occur  when  there  is  good 
network  connectivity  (e.g.,  WiFi  coverage  or  a  USB  cable). 

Restore  an  app  with  all  its  permissions  if  the  user  desires:  ZapDroid  restores  a 
zombie  app  on  the  user’s  device  if  she  so  desires.  The  state  of  the  app  is  identical  to 
that  prior  to  the  quarantine.  For  the  “likely  to  restore”  category  of  apps,  the 
restoration time is under 6 ms. For the “unlikely to restore” category, restoration depends
on  the  network  connectivity  to  where  the  app  was  stored  during  the  quarantine  and  is 
typically  on  the  order  of  a  few  seconds. 
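
The two quarantine modes and the restore path can be modeled as follows. This is a minimal in-memory sketch, assuming hypothetical class and field names; in ZapDroid the operations are carried out by the modified Android OS, and the `offsite` dictionary merely stands in for the cloud or desktop store.

```python
from enum import Enum


class Category(Enum):
    LIKELY_TO_RESTORE = "likely to restore"
    UNLIKELY_TO_RESTORE = "unlikely to restore"


class QuarantineManager:
    def __init__(self):
        self.device = {}    # app name -> {"permissions": set, "state": dict}
        self.revoked = {}   # app name -> permissions withheld while quarantined
        self.offsite = {}   # app name -> full snapshot moved off the device

    def install(self, name, permissions, state):
        self.device[name] = {"permissions": set(permissions), "state": dict(state)}

    def quarantine(self, name, category):
        app = self.device[name]
        if category is Category.LIKELY_TO_RESTORE:
            # Keep data and binaries on the device; only revoke permissions.
            self.revoked[name] = app["permissions"]
            app["permissions"] = set()
        else:
            # Remove the app from the device and keep its state elsewhere.
            self.offsite[name] = self.device.pop(name)

    def restore(self, name):
        if name in self.revoked:
            self.device[name]["permissions"] = self.revoked.pop(name)
        elif name in self.offsite:
            self.device[name] = self.offsite.pop(name)
        else:
            raise KeyError(name)


mgr = QuarantineManager()
mgr.install("osn_portal", {"LOCATION", "RECORD_AUDIO"}, {"login": "alice"})
original = {"permissions": {"LOCATION", "RECORD_AUDIO"}, "state": {"login": "alice"}}
mgr.quarantine("osn_portal", Category.LIKELY_TO_RESTORE)
```

The first mode touches only a permission set, which fits the millisecond-scale restoration reported above; the second mode depends on fetching the snapshot back, which is where network connectivity enters.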

We  evaluate  ZapDroid  via  extensive  measurements  on  5  different  Android  smartphones 
(from 4 vendors). We show that the overhead of ZapDroid is low (< 4% of the battery is
consumed  per  day).  We  show  that  ZapDroid  saves  more  than  2x  the  energy  expended  due 
to  zombie  app  activities,  as  compared  to  other  popular  apps  on  the  Google  Play  Store 
used  to  kill  undesired  background  processes;  further,  unlike  these  apps,  it  prevents  access 
to  undesired  permissions  by  the  zombie  apps. 

Note  that  ZapDroid  does  not  require  changes  to  an  external  cloud  store  (for  quarantine  or 
restoration);  all  modifications  are  made  only  in  the  Android  OS.  We  envision  that  the 
features  of  ZapDroid  will  be  useful  in  general,  and  our  hope  is  that  this  could  lead  to  an 
integration  of  the  functions  within  the  Android  OS.  A  preliminary  paper  on  ZapDroid 
appears  in  ACM  UbiComp  2015  [4]  and  an  extended  version  appears  in  the  IEEE 
Transactions  on  Mobile  Computing  [7]. 

3.  Privacy  Preservation  of  Online  Social  Media  Conversations 

Today,  leakage  of  information  from  OSN  servers,  coupled  with  the  need  for  OSN 
providers  to  mine  user  data  (e.g.,  for  targeted  advertisements),  has  concerned  users. 
While  posting  encrypted  data  on  OSNs  can  work  in  theory,  it  compromises  the  profit 
motives of an OSN if done at scale. Alternatively, one could share private content with
OSN friends by storing data outside the OSN provider’s control. Prior works that follow
this approach either store private content in the cloud or across client machines. The
former simply leaks private information to the cloud providers in lieu of the OSN
providers, and also increases user costs. The viability of an approach based on the latter
depends on the availability of consistent access to client machines.

In this work, we design a decentralized OSN architecture, Hermes, with cost-effective
privacy in mind. Hermes seeks to ensure that both the content shared by a user and her
sharing habits are kept private from both the OSN provider and undesired friends. In
doing so, Hermes seeks to (i) minimize the costs borne by users, and (ii) preserve the
interactive and chronologically consistent conversational structure offered by a
centralized OSN.

Hermes uses three key techniques to meet these goals. First, it judiciously combines the
use of compute and storage resources in the cloud to bootstrap conversations associated
with newly shared content. This also supports the high availability of the content. Second,
it employs a novel cost-effective message propagation mechanism to enable
dissemination of comments in a timely and consistent manner. It identifies and purges
(from cloud storage) content that has been accessed by all intended recipients. Lastly, but
most importantly, Hermes carefully orchestrates how fake postings are included in order
to hide sharing patterns from the untrusted cloud providers used to store and propagate
content, while minimizing the additional costs incurred in doing so. A key feature of
Hermes is its flexibility in deployment; it can either be implemented as a stand-alone
distributed OSN or as an add-on to today’s OSNs like Facebook (while maintaining the
decentralized nature of content sharing). To summarize, our contributions are:

Design of Hermes: As our primary contribution, we design Hermes. It utilizes extremely
small amounts of storage, bandwidth, and computing on the cloud to facilitate real-time,
consistent and anonymous exchange of private content. Importantly, Hermes ensures that
cloud providers cannot discover the users involved in private conversations and is robust
to the intersection attack (where an attacker can correlate the participants across different
conversations).

Analyzing OSN data to determine resource requirements: Based on 1.8 million posts
crawled from Facebook, we 1) perform an analysis to determine key parameters for
implementing Hermes, and 2) conduct realistic simulations to show that (a) Hermes
effectively anonymizes users’ sharing patterns and (b) Hermes’s use of cloud resources is
low enough to facilitate its practical deployment. Our analysis suggests that, for 90% of
users, Hermes would typically require 1) cloud storage of much less than 5 MB, and 2) a
compute instance on the cloud that is active for roughly 4 days every month. This
corresponds to a monthly cost of less than $5 per user. With this budget, Hermes ensures
that cloud service providers are unable to guess the members or the group size of any
private conversation. If the cloud provider attempts to randomly guess the group
members, it is correct less than 15% of the time.


Implementation and evaluation: We implement a prototype of Hermes as a rudimentary
add-on to Facebook. Our evaluations show that Hermes incurs low cost, and the user
experience, in terms of delays, is similar to that with Facebook as shown in the figures
below. This work appears in SecureComm 2015 [5].
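
The purge step listed among Hermes's techniques (removing content from cloud storage once all intended recipients have accessed it) can be illustrated with a small bookkeeping sketch. The class, its method names, and the use of opaque recipient tokens are assumptions for illustration; the report does not specify where this bookkeeping runs, and in Hermes it would have to be arranged so that the cloud provider does not learn group membership.

```python
class PurgeTracker:
    """Tracks which recipients still need each stored item (illustrative only)."""

    def __init__(self):
        self._blobs = {}    # item id -> encrypted bytes held in cloud storage
        self._pending = {}  # item id -> opaque recipient tokens yet to fetch

    def post(self, item_id, ciphertext, recipient_tokens):
        self._blobs[item_id] = ciphertext
        self._pending[item_id] = set(recipient_tokens)

    def fetch(self, item_id, token):
        ciphertext = self._blobs[item_id]
        self._pending[item_id].discard(token)
        if not self._pending[item_id]:
            # Every intended recipient has the item: free the storage.
            del self._blobs[item_id]
            del self._pending[item_id]
        return ciphertext

    def stored_bytes(self):
        return sum(len(b) for b in self._blobs.values())


tracker = PurgeTracker()
tracker.post("comment-1", b"x" * 100, {"tok-a", "tok-b"})
first = tracker.fetch("comment-1", "tok-a")
size_after_first = tracker.stored_bytes()
second = tracker.fetch("comment-1", "tok-b")
size_after_second = tracker.stored_bytes()
```

Purging eagerly in this way is one reason the storage footprint can stay small, in line with the low storage requirement reported in the analysis above.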
4. INTANG: A practical measurement-based tool for censorship evasion

Internet censorship and surveillance are prevalent nowadays. Censorship systems such as
the Great Firewall (GFW) of China have the capability of analyzing terabyte-level
traffic across the country in real time. Protocols with plaintext (e.g., HTTP, DNS, IMAP)
are directly subject to surveillance and manipulation by the censors, while protocols
with encryption (e.g., SSH, TLS/SSL, PPTP/MPPE) and Tor can be identified via traffic
fingerprinting, leading to subsequent blocking at the IP level.

The  key  technology  behind  these  censorship  systems  is  Deep  Packet  Inspection  (DPI), 
which  also  powers  Network  Intrusion  Detection  Systems  (NIDS).  As  previously  reported, 
most censorship NIDS are deployed “on-path” in the backbone and at border routers.

In  order  to  examine  application-level  payloads,  DPI  techniques  have  to  correctly 
implement  the  underlying  protocols  like  the  TCP  protocol,  which  is  the  cornerstone  of 
today’s  Internet.  Earlier  work  has  shown  that  any  NIDS  is  inherently  incapable  of  always 
reconstructing  a  TCP  stream  the  same  way  as  its  endpoints.  The  root  cause  for  this  is  the 
discrepancies  between  the  implementations  of  the  TCP  (and  possibly  other)  protocol  at 
the  end-host  and  at  the  NIDS.  Even  if  the  NIDS  perfectly  mirrored  the  implementation  of 
one  specific  TCP  implementation,  it  may  still  have  problems  processing  a  stream  of 
packets  generated  by  another  TCP  implementation. 

Because of such ambiguity in packet processing, it is possible for a sender to send carefully
crafted packets to desynchronize the TCP Control Block (TCB) maintained by the NIDS
from the TCB on the receiver side. In some cases, the NIDS can even be tricked into
completely deactivating the TCB (e.g., after receiving a spurious RST packet), effectively
allowing an adversary to “manipulate” the TCB on the NIDS. Censorship monitors suffer
from the same fundamental flaw: a client can evade censorship if the TCB on a
censorship monitor can be successfully desynchronized from the one on the server.
Unlike other censorship evasion technologies such as VPN, Tor, and Telex that
rely on additional network infrastructure (e.g., a proxy node), TCB-manipulation-based
evasion techniques only require crafting/manipulating packets on the client side and can
potentially help all TCP-based application-layer protocols “stay under the radar”. Based
on this idea, some prior work has explored several practical evasion techniques against
the GFW by studying its behaviors at the TCP and HTTP layers. The West Chamber
Project provides a practical tool that implemented a few evasion strategies but has
ceased development since 2011; unfortunately, none of the strategies were effective during
our measurement. Besides these attempts, there is no recent data point showing how this
evasion technique works in the wild.
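
The desynchronization idea can be illustrated with a toy model in which an on-path monitor discards its TCB on any RST, while the crafted RST never reaches the server. This is a sketch under stated assumptions, not a working evasion: the classes are hypothetical, sequence numbers and checksums are ignored, and the limited-TTL trick is just one textbook way to make a packet visible to the monitor only. As noted in this section, simple strategies of this kind were found ineffective against the current GFW.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str          # "SYN", "RST", or "DATA"
    payload: str = ""
    ttl: int = 64      # number of hops the packet survives


class Monitor:
    """On-path inspector that keeps a TCB and tears it down on any RST."""

    def __init__(self, keyword):
        self.keyword = keyword
        self.tcb_active = False
        self.alerts = 0

    def observe(self, pkt):
        if pkt.kind == "SYN":
            self.tcb_active = True
        elif pkt.kind == "RST":
            self.tcb_active = False
        elif pkt.kind == "DATA" and self.tcb_active and self.keyword in pkt.payload:
            self.alerts += 1


class Server:
    def __init__(self):
        self.connected = False
        self.received = []

    def receive(self, pkt):
        if pkt.kind == "SYN":
            self.connected = True
        elif pkt.kind == "RST":
            self.connected = False
        elif pkt.kind == "DATA" and self.connected:
            self.received.append(pkt.payload)


def simulate(inject_rst, hops_to_monitor=3, hops_to_server=10):
    monitor, server = Monitor("forbidden"), Server()
    packets = [Packet("SYN")]
    if inject_rst:
        # TTL large enough to pass the monitor, too small to reach the server.
        packets.append(Packet("RST", ttl=5))
    packets.append(Packet("DATA", "GET /forbidden"))
    for pkt in packets:
        if pkt.ttl >= hops_to_monitor:
            monitor.observe(pkt)
        if pkt.ttl >= hops_to_server:
            server.receive(pkt)
    return monitor.alerts, server.received
```

Calling `simulate(False)` yields one alert, while `simulate(True)` yields none even though the server receives the same request in both cases; the monitor's state has diverged from the server's.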

In  this  work,  we  extensively  evaluate  the  TCP-layer  censorship  evasion  against  the  GFW. 
By testing from 11 vantage points inside China spread across 9 cities (and 3 ISPs), we are
able  to  cover  a  variety  of  network  paths  that  potentially  include  different  types  of  GFW 
devices  and  middleboxes.  We  measure  how  TCB  manipulation  can  help  HTTP,  DNS, 
and  Tor  evade  the  GFW. 

First,  we  measure  how  existing  censorship  evasion  strategies  work  in  practice. 
Interestingly,  we  find  that  most  of  them  no  longer  work  well  due  to  challenges  in  network 
conditions,  interference  from  the  network  middleboxes,  or  more  importantly,  new 
updates  to  the  GFW  (different  from  models  considered  previously).  These  initial 
measurement results motivate us to construct probing tests to infer the “new” updated
GFW  model.  Finally,  based  on  the  new  GFW  model  and  lessons  learned  from  other 
practical  challenges  in  deploying  TCP-layer  censorship  evasion,  we  develop  a  list  of  new 
evasion  strategies.  Our  measurement  results  show  that  the  new  strategies  have  a  90%  or 
above  evasion  success  rate.  We  also  evaluate  how  these  new  strategies  can  help  HTTP, 
DNS,  Tor,  and  VPN  evade  the  GFW. 

In addition, during the course of our measurement study, we design and implement a
censorship evasion tool, which we call INTANG. It integrates all of the censorship evasion
strategies mentioned in this work and is easily extensible. The tool requires zero
configuration and runs in the background to help normal traffic evade censorship. We
plan to open source the tool, which will support future research in this direction.
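
One way to picture an extensible strategy framework is a registry of pluggable strategies plus a selector that prefers whichever has worked best so far. The sketch below is hypothetical and is not INTANG's actual design or API: the registry, the placeholder strategy bodies (which only mark where crafted packets would be inserted), and the success-rate selection rule are all assumptions made for illustration.

```python
STRATEGIES = {}


def register(name):
    """Decorator that adds a strategy function to the registry."""
    def decorator(fn):
        STRATEGIES[name] = fn
        return fn
    return decorator


@register("improved_tcb_teardown")
def improved_tcb_teardown(segments):
    # Placeholder: a real strategy would craft and inject an actual packet.
    return ["<crafted RST>"] + list(segments)


@register("improved_in_order_data_overlapping")
def improved_in_order_data_overlapping(segments):
    return ["<junk segment with overlapping sequence range>"] + list(segments)


class StrategySelector:
    def __init__(self, names):
        self.stats = {name: [0, 0] for name in names}  # [successes, trials]

    def record(self, name, success):
        self.stats[name][1] += 1
        if success:
            self.stats[name][0] += 1

    def best(self):
        def rate(name):
            successes, trials = self.stats[name]
            # Untried strategies rank first so that each gets measured.
            return successes / trials if trials else 1.0
        return max(self.stats, key=rate)


selector = StrategySelector(STRATEGIES)
selector.record("improved_tcb_teardown", False)
selector.record("improved_in_order_data_overlapping", True)
chosen = selector.best()
outgoing = STRATEGIES[chosen](["GET /"])
```

Adding a strategy then amounts to writing one decorated function, which is the kind of extensibility the text describes.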

We summarize our contributions as follows:

We  are  the  first  to  extensively  measure  the  GFW’s  behaviors  with  TCP-layer 
censorship  evasion  techniques. 

We  demonstrate  that  existing  strategies  are  either  not  working  or  are  limited  in 
practice. 

We  develop  an  updated  and  more  comprehensive  model  of  the  GFW  based  on  the 
measurement  results. 


We  propose  new,  measurement-driven  strategies  that  can  bypass  the  new  model. 


We  measure  the  success  rate  of  our  improved  strategy  on  censorship  evasion  for 
HTTP,  DNS,  VPN,  and  Tor.  The  results  show  very  high  success  rates. 

We develop a tool that automatically measures the GFW’s responsiveness and can also
be used for censorship circumvention. The tool is extensible as a framework for the
integration of additional evasion strategies for future research.

Below we present a table that showcases the effectiveness of INTANG in
successfully evading the GFW. The work will appear in ACM IMC 2017 [6].


Vantage   Strategy                               Success                  Failure 1                Failure 2
Points                                           Min     Max     Avg.     Min     Max     Avg.     Min     Max     Avg.

Inside    Improved TCB Teardown                  89.2%   98.2%   95.8%    1.7%    6.7%    3.1%     0.0%    5.4%    1.1%
China     Improved In-order Data Overlapping     86.7%   97.1%   94.5%    2.9%    8.9%    4.4%     0.0%    5.2%    1.1%
          TCB Creation + Resync/Desync           88.5%   98.1%   95.6%    1.9%    7.0%    3.3%     0.0%    5.5%    1.1%
          TCB Teardown + TCB Reversal            90.2%   98.2%   96.2%    1.7%    5.6%    2.6%     0.0%    5.7%    1.1%
          INTANG Performance                     93.7%   100.0%  98.3%    0.0%    3.0%    0.9%     0.0%    3.5%    0.6%

Outside   Improved TCB Teardown                  8?.?%   92.9%   89.?%    4.6%    7.6%    6.?%     0.3%    6.8%    3.5%
China     Improved In-order Data Overlapping     89.4%   96.0%   92.7%    1.3%    6.2%    3.?%     0.6%    7.0%    3.7%
          TCB Creation + Resync/Desync           78.1%   95.6%   84.?%    2.4%    18.6%   12.9%    0.9%    4.0%    2.6%
          TCB Teardown + TCB Reversal            84.6%   93.1%   89.5%    5.5%    8.7%    7.1%     0.1%    7.9%    3.3%